Multilingual Code-switching Identification via LSTM Recurrent Neural Networks
Authors
Abstract
This paper describes the HHU-UH-G system submitted to the EMNLP 2016 Second Workshop on Computational Approaches to Code Switching. Our system ranked first for Arabic (MSA-Egyptian) with an F1-score of 0.83 and second for Spanish-English with an F1-score of 0.90. The HHU-UH-G system introduces a novel unified neural network architecture for language identification in code-switched tweets for both Spanish-English and the MSA-Egyptian dialect pair. The system uses word-level and character-level representations to identify code-switching. For MSA-Egyptian, the system obtains state-of-the-art performance without relying on any language-specific knowledge or linguistic resources such as part-of-speech (POS) taggers, morphological analyzers, gazetteers, or word lists.
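The abstract does not specify how the word-level and character-level inputs are built, so the following is only a minimal sketch of one common way to prepare both views of a tweet before they reach a neural tagger. The function names, the reserved `PAD`/`UNK` indices, the lowercasing of words, and the `max_word_len` parameter are all assumptions made for illustration, not details of the HHU-UH-G system.

```python
from typing import Dict, Iterable, List, Tuple

PAD, UNK = 0, 1  # reserved indices (an assumed convention, not from the paper)


def build_vocab(items: Iterable[str]) -> Dict[str, int]:
    """Map each distinct item to an integer id, in first-seen order."""
    vocab = {"<pad>": PAD, "<unk>": UNK}
    for item in items:
        if item not in vocab:
            vocab[item] = len(vocab)
    return vocab


def encode_tweet(
    tokens: List[str],
    word_vocab: Dict[str, int],
    char_vocab: Dict[str, int],
    max_word_len: int = 8,  # hypothetical truncation/padding length
) -> Tuple[List[int], List[List[int]]]:
    """Return one word id per token and one fixed-length row of char ids per token."""
    word_ids = [word_vocab.get(tok.lower(), UNK) for tok in tokens]
    char_ids = []
    for tok in tokens:
        row = [char_vocab.get(ch, UNK) for ch in tok[:max_word_len]]
        row += [PAD] * (max_word_len - len(row))  # right-pad short words
        char_ids.append(row)
    return word_ids, char_ids


# Toy Spanish-English example (invented for illustration).
tweet = ["I", "love", "mi", "familia"]
word_vocab = build_vocab(tok.lower() for tok in tweet)
char_vocab = build_vocab(ch for tok in tweet for ch in tok)
word_ids, char_ids = encode_tweet(tweet, word_vocab, char_vocab)
```

In a setup like this, the word ids would typically index an embedding table while the character rows feed a character-level encoder; the character view still carries a signal for words that map to the unknown-word index.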
Similar resources
Multilingual Recurrent Neural Networks with Residual Learning for Low-Resource Speech Recognition
The shared-hidden-layer multilingual deep neural network (SHL-MDNN), in which the hidden layers of feed-forward deep neural network (DNN) are shared across multiple languages while the softmax layers are language dependent, has been shown to be effective on acoustic modeling of multilingual low-resource speech recognition. In this paper, we propose that the shared-hidden-layer with Long Short-T...
Internal Memory Gate for Recurrent Neural Networks with Application to Spoken Language Understanding
Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNN) require 4 gates to learn short- and long-term dependencies for a given sequence of basic elements. Recently, the “Gated Recurrent Unit” (GRU) has been introduced; it requires fewer gates than the LSTM (reset and update gates) to encode short- and long-term dependencies and reaches performance equivalent to the LSTM, with less processing time during ...
Data-Driven Forecasting of High-Dimensional Chaotic Systems with Long-Short Term Memory Networks
We introduce a data-driven forecasting method for high dimensional, chaotic systems using Long-Short Term Memory (LSTM) recurrent neural networks. The proposed LSTM neural networks perform inference of high dimensional dynamical systems in their reduced order space and are shown to be an effective set of non-linear approximators of their attractor. We demonstrate the forecasting performance of ...
Dependency Parsing of Code-Switching Data with Cross-Lingual Feature Representations
This paper describes the test of a dependency parsing method which is based on bidirectional LSTM feature representations and multilingual word embeddings, and evaluates the results on mono- and multilingual data. The results are similar in all cases, with slightly better results achieved using multilingual data. The languages under investigation are Komi-Zyrian and Russian. Examination of the r...
Language Identification in Short Utterances Using Long Short-Term Memory (LSTM) Recurrent Neural Networks
Long Short Term Memory (LSTM) Recurrent Neural Networks (RNNs) have recently outperformed other state-of-the-art approaches, such as i-vector and Deep Neural Networks (DNNs), in automatic Language Identification (LID), particularly when dealing with very short utterances (∼3s). In this contribution we present an open-source, end-to-end, LSTM RNN system running on limited computational resources...
Journal title:
Volume / issue:
Pages: -
Publication date: 2016